Abstract



Many mechanisms for the emergence and maintenance of altruistic behavior in social dilemma
situations have been proposed. Indirect reciprocity is one such mechanism, where other-regarding
actions of a player are eventually rewarded by other players with whom the original player has
not interacted. Upstream reciprocity (also called generalized indirect reciprocity) is a type of
indirect reciprocity and represents the concept that those helped by somebody will help other
unspecified players. In spite of the evidence for the enhancement of helping behavior by upstream
reciprocity in rats and humans, theoretical support for this mechanism is not strong. In the
present study, we numerically investigate upstream reciprocity in heterogeneous contact networks,
in which the players generally have different numbers of neighbors. We show that heterogeneous
networks considerably enhance cooperation in a game of upstream reciprocity. In heterogeneous
networks, the most generous strategy, by which a player helps a neighbor on being helped and in
addition initiates helping behavior, first occupies hubs in a network and then disseminates to
other players. The scenario to achieve enhanced altruism resembles that seen in the case of the
Prisoner's Dilemma game in heterogeneous networks.

1 Introduction



The mechanism for evolution and maintenance of altruism when egoistic behavior is apparently 
more advantageous has been a target of intensive studies. Among the many viable mechanisms 
proposed, we focus on indirect reciprocity, which refers to the concept that a cooperative player 
is helped by others with whom she/he has not interacted. Cooperative behavior is indirectly 
rewarded by way of chains of helping behavior of various players. There are two types of indirect
reciprocity: downstream reciprocity and upstream reciprocity (Nowak and Sigmund, 2005). In
downstream reciprocity, a player witnesses the behavior of other players as a third party. The
observing player will assign a good reputation to player X if player X helps others. When a
situation arises where this observer interacts with player X in the future, the observer will
probably help X if and only if X has a good reputation. A player must establish a good reputation
by helping others prior to being helped by other anonymous players. Downstream reciprocity is
observed in behavioral experiments (Wedekind and Milinski, 2000; Milinski et al., 2002) and is
firmly based on the theory of evolutionary games (Nowak and Sigmund, 1998a; Nowak and Sigmund,
199...; Leimar and Hammerstein, 2001; Brandt and Sigmund, 2004; Ohtsuki and Iwasa, 2004; Ohtsuki
and Iwa...).



In upstream reciprocity, the players first get help from other players. If the recipient
complies with upstream reciprocity, then she/he helps another unspecified player. Theoretically,
evolution of cooperation based on upstream reciprocity is considered to be difficult. In
numerical simulations, cooperation is achieved only when the size of the interaction group is
small (Boyd and Richerson, 1989; Pfeiffer et al., 2005). An analytical study showed that upstream
reciprocity enables evolution of cooperation only in combination with another mechanism such as
direct reciprocity (i.e., repeated interaction between the same players) or spatial reciprocity
(i.e., interaction between players on a one-dimensional lattice) (Nowak and Roch, 2007). However,
upstream reciprocity has been observed in behavioral experiments conducted on humans. A player
who has received help from another player shows an increased propensity to help an anonymous
partner in variants of the trust game (Dufwenberg et al., 2001; Greiner and Levati, 2005; Stanca,
2009). Those who are helped by somebody in advance tend to help another partner filling in a
tedious survey in laboratory behavioral experiments (Bartlett and DeSteno, 2006). Upstream
reciprocity has also been observed in rats. Rats trained to pull a stick to deliver food tend to
pull the stick to help another rat after receiving food via help from a conspecific (Rutte and
Taborsky, 2007). Therefore, theoretically assessing the conditions under which upstream
reciprocity is feasible will help us gain a better understanding of the evolution of cooperation
in social dilemma situations.

In this study, we examine the effect of a property of contact networks on upstream
reciprocity. A fundamental characteristic of many social networks is that the number of contacts
of a node, which we call the degree, has a right-skewed distribution. In particular, scale-free
networks, i.e., networks with power-law degree distributions, are widely found (e.g., Newman,
2003). In social networks relevant to evolutionary games, scale-free networks have been found
in, for example, email social networks (Ebel et al., 2002; Newman et al., 2002). Although other
social networks do not exhibit degree distributions that are as right skewed as the power-law
distribution, their degree distributions are considerably heterogeneous (Eubank et al., 2004;
Lusseau and Newman, 2004; Kossinets and Watts, 2006; Onnela et al., 2007). We investigate
the effect of heterogeneous degree distributions on the possible evolution of cooperation based
on upstream reciprocity.

We show that upstream reciprocity enhances altruistic behavior of players that are placed in
heterogeneous contact networks such as scale-free networks. The mechanism found in our study
resembles that for enhanced cooperation shown in the Prisoner's Dilemma in heterogeneous
networks (Duran and Mulet, 2005; Santos and Pacheco, 2005; Santos et al., 2006; Santos and
Pacheco, 2006), which we will discuss in Sec. 4.

2 Model
2.1 Networks 

Consider a contact network with a population of $N = 10000$ players. As a model of a
heterogeneous network, we use the scale-free network generated by the Barabasi-Albert algorithm
(Barabasi and Albert, 1999) (Fig. 1A). To generate the scale-free network, we start with the
complete graph of $2m + 1$ nodes (i.e., each pair of nodes is connected by an edge). Then, we add
nodes with degree $m$ one by one according to the so-called linear preferential attachment; the
probability that an already existing node $v_i$ forms an edge with a newly introduced node is
proportional to its degree $k_i$. Multiple edges (i.e., more than one edge connecting a pair of
nodes) are disallowed. In the generated network, the degree follows the power-law distribution
$p(k) \propto k^{-3}$ with a lower cutoff at $k = m$ and the mean degree of
$\langle k \rangle = 2m$ (Barabasi and Albert, 1999). We use $\langle k \rangle = 8$, i.e.,
$m = 4$, unless otherwise stated.
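
The growth procedure described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used for the simulations; the function name `barabasi_albert`, the stub-list sampling trick, and the seed value are our own assumptions.

```python
import random


def barabasi_albert(n, m, seed=None):
    """Grow a network by linear preferential attachment (adjacency sets)."""
    rng = random.Random(seed)
    n0 = 2 * m + 1                      # size of the complete seed graph
    adj = [set() for _ in range(n)]
    stubs = []                          # node i is listed k_i times
    for i in range(n0):
        for j in range(i + 1, n0):
            adj[i].add(j)
            adj[j].add(i)
            stubs += [i, j]
    for new in range(n0, n):
        targets = set()                 # a set rules out multiple edges
        while len(targets) < m:
            # Drawing from the stub list picks a node with probability
            # proportional to its current degree.
            targets.add(rng.choice(stubs))
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            stubs += [new, t]
    return adj


adj = barabasi_albert(n=10000, m=4, seed=1)
mean_degree = sum(len(nbrs) for nbrs in adj) / len(adj)
print(mean_degree)
```

With a complete seed graph of $2m + 1$ nodes, the seed contributes $m(2m + 1)$ edges and each of the remaining $N - 2m - 1$ nodes contributes $m$ edges, so the total is exactly $Nm$ edges and the mean degree is exactly $2m$.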



For comparison, we also use four other types of networks. One is the regular random graph,
which is constructed from the configuration model (Newman, 2003) (Fig. 1B). To generate a
network, we attach $\langle k \rangle$ stubs, or half edges, to each node. Then, we randomly
select two nodes with equal selection probability and connect them. These two nodes consume one
stub each. We repeat this procedure until all stubs are exhausted at all nodes. If the generated
network is disconnected or contains self-loops or multiple edges, we discard the network and
start the entire procedure all over again. Although its mean degree is small, the regular random
graph represents a well-mixed population in which cooperation is not easily enhanced by upstream
reciprocity (Boyd and Richerson, 1989; Nowak and Roch, 2007).
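
A minimal sketch of this rejection procedure is given below. It pairs stubs by shuffling the full stub list, which is a common way to realize the configuration model and may differ in detail from the node-by-node selection described in the text. The function names, the retry limit, and the small demonstration size ($n = 50$, $k = 3$) are assumptions made for illustration; the chance that a random pairing is free of self-loops and multiple edges shrinks quickly as the degree grows, so the plain retry loop can become slow for larger degrees.

```python
import random
from collections import deque


def is_connected(adj):
    """Breadth-first search from node 0 reaches every node."""
    seen, queue = {0}, deque([0])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == len(adj)


def regular_random_graph(n, k, seed=None, max_tries=10000):
    """Configuration model with rejection; n * k must be even."""
    rng = random.Random(seed)
    for _ in range(max_tries):
        stubs = [v for v in range(n) for _ in range(k)]  # k stubs per node
        rng.shuffle(stubs)
        adj = [set() for _ in range(n)]
        simple = True
        for a, b in zip(stubs[0::2], stubs[1::2]):
            if a == b or b in adj[a]:    # self-loop or multiple edge
                simple = False
                break
            adj[a].add(b)
            adj[b].add(a)
        if simple and is_connected(adj):
            return adj                   # otherwise discard and start over
    raise RuntimeError("no simple connected graph found")


graph = regular_random_graph(n=50, k=3, seed=0)
```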



In the square lattice, $N = 10000$ nodes are placed on a square with a linear length of
$\sqrt{N} = 100$. Each node is connected to eight nodes situated in a so-called Moore
neighborhood (Fig. 1C). We adopt the periodic boundary condition.

The extended cycle is a one-dimensional network, where the nodes are placed on a ring.
Each node is connected to $\langle k \rangle / 2$ nearest nodes on each side, as shown in Fig. 1D.

The scale-free network, the regular random graph, the square lattice, and the extended cycle
have $\langle k \rangle = 8$ unless otherwise stated. Therefore, we can compare the effects of
different types of networks without having to account for the possible influence of
$\langle k \rangle$. We also set $\langle k \rangle = 6$ and $\langle k \rangle = 14$ in some of
the following numerical simulations to confirm the robustness of the results with respect to
$\langle k \rangle$.

The final type of network used is the cycle, in which each node on a ring is connected to a
single nearest node on each side such that $\langle k \rangle = 2$ (Fig. 1E). We use the cycle to
compare our numerical results with the previously reported theoretical results (Nowak and Roch,
2007). In contrast to the well-mixed population, the infinite one-dimensional chain network with
$\langle k \rangle = 2$ enables upstream reciprocity because it exhibits spatial reciprocity.
Spatial reciprocity is a general mechanism for evolution of cooperation in social dilemma games;
cooperative players are clustered in a network to help each other and resist the invasion by
egoistic players (Axelrod, 1984; Nowak and May, 1992). Such clustering is possible when the size
of the boundary of a cluster is small relative to the number of players in the cluster. This
situation is expected the most in the cycle and to a certain extent in the extended cycle and the
square lattice; however, it is not expected in the Barabasi-Albert scale-free network and the
regular random graph.

2.2 Game of upstream reciprocity: rule and payoff 



A single game of upstream reciprocity (Nowak and Roch, 2007), which is motivated by
experimental evidence and previous theoretical work explained in Sec. 1, is described as follows.
First, a player $v_i$ ($1 \le i \le N$) is selected. Player $v_i$ may initiate a chain of helping
behavior. If $v_i$ does so, $v_i$ bears the cost $c$ and selects one of its neighbors at an equal
selection probability of $1/k_i$, where $k_i$ is the degree of $v_i$. The selected neighbor,
denoted by $v_j$, receives the payoff $b$. We assume $b > c > 0$ so that the game represents a
social dilemma; a single act of help increases the average payoff of the entire population by
$(b - c)/N$, while each player is better off by not helping other players. Without loss of
generality, we set $c = 1$.

Player $v_j$ may not continue the chain of helping behavior. In such a case, the chain of
cooperation terminates, and the payoffs for $v_i$, $v_j$, and $v_{i'}$ ($i' \neq i, j$) are equal
to $-c$, $b$, and 0, respectively. However, if $v_j$ does pass on the helping action, $v_j$
selects one of its neighbors at a probability of $1/k_j$ and bears the cost $c$. The selected
neighbor receives $b$. The chain of helping behavior continues until a recipient of help
terminates the chain. Note that a chain of cooperation may traverse the same players more than
once.
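
One chain of helping behavior can be simulated as follows. The sketch assumes that each player carries a probability of initiating a chain and a probability of passing on received help, stored here in the lists `q` and `p`; the function name `play_chain` and the three-player toy network are hypothetical. If every passing probability were equal to 1, the loop would never terminate.

```python
import random


def play_chain(adj, p, q, start, b, c, payoff, rng):
    """Play one chain initiated by `start`; return the number of helping acts."""
    if rng.random() >= q[start]:
        return 0                           # the chain is empty
    giver, acts = start, 0
    while True:
        receiver = rng.choice(adj[giver])  # each neighbor with probability 1/k
        payoff[giver] -= c                 # the helper bears the cost
        payoff[receiver] += b              # the recipient gains the benefit
        acts += 1
        if rng.random() >= p[receiver]:    # the recipient terminates the chain
            return acts
        giver = receiver


# Hypothetical toy setting: a triangle of players that always initiate
# and pass on help with probability 0.8.
adj = [[1, 2], [0, 2], [0, 1]]
p, q = [0.8] * 3, [1.0] * 3
payoff = [0.0] * 3
rng = random.Random(0)
acts = sum(play_chain(adj, p, q, s, 1.5, 1.0, payoff, rng) for s in range(3))
```

Each helping act adds $b - c$ to the total payoff of the population, so `sum(payoff)` equals `acts * (b - c)` after any number of chains.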

2.3 Strategies 



On the basis of a previous study (Nowak and Roch, 2007), we specify the strategy of each
player $v_i$ ($1 \le i \le N$) using two parameters. The first parameter $p_i$
($0 \le p_i \le 1$) denotes the probability that $v_i$ passes on the helping action to a randomly
selected neighbor after receiving it from a neighbor. The second parameter $q_i$
($0 \le q_i \le 1$) denotes the probability that $v_i$ initiates the helping action. A larger
$p_i$ or $q_i$ implies that player $v_i$ is more cooperative.

We consider the following four strategies that were introduced by Nowak and Roch (2007):

• Classical defector (CD) is defined by $p_i = 0$ and $q_i = 0$. CD neither initiates nor passes
on the help. It is the most egoistic strategy.

• Classical cooperator (CC) is defined by $p_i = 0$ and $q_i = 1$. CC spontaneously initiates the
chain of helping behavior but does not react to the cooperation that it receives from a
neighbor. CC does not contribute to upstream reciprocity, even though CC is cooperative
to some extent.

• Generous cooperator (GC) is defined by $p_i = 0.8$ and $q_i = 1$. GC initiates the helping
behavior and passes on the helping action with a high probability. It is the most cooperative
strategy. We are concerned with the possibility that heterogeneous networks enhance
the fraction of GCs in a population.

• Passer-on (PO) is defined by $p_i = 0.8$ and $q_i = 0$. PO does not initiate the helping
behavior but passes on the helping action with a high probability. Although PO is less
cooperative than GC, it contributes to the upstream reciprocity.

In the case of GC and PO, we set $p_i = 0.8$ instead of $p_i = 1$. This is to prevent a chain
of helping behavior from continuing indefinitely if the population consists of only GC and PO.
This choice of $p_i$ is arbitrary. To verify the robustness of our results with respect to the
value of $p_i$, we will carry out some of the following numerical simulations with $p_i = 0.7$
and $p_i = 0.9$.

2.4 Update rule 

We principally use the deterministic update rule, which is described in the following. The
numerical results do not qualitatively change on using relatively realistic stochastic rules, as
shown in Secs. 3.1 and 3.4.

We refer to time in the evolutionary dynamics as a round and denote it by $t$
($= 0, 1, 2, \ldots$). One round consists of $N$ chains of helping behavior, one initiated by
each player. Note that a chain is considered to be empty if the initial player does not help a
neighbor, which occurs for CD and PO. The one-round payoff of player $v_i$ is defined as the sum
of the payoffs gained by $v_i$ in the $N$ chains of cooperation. The payoff that $v_i$ gains in a
round is equal to $b \times$ (the frequency at which the chains are brought to $v_i$)
$- c \times$ (the frequency at which the chains are passed from $v_i$ without being terminated).

At the end of each round, the strategies of $N_u$ out of the $N = 10000$ players are updated
synchronously. Unless otherwise stated, we set $N_u = 200$. We also set $N_u = 20$ and
$N_u = 2000$ in some of the following numerical simulations to examine the robustness of the
results with respect to $N_u$. We randomly and independently select $N_u$ players from the
population with equal probability. In the deterministic update rule that we mostly use in this
paper, for each selected player $v_i$, the neighbor with the largest payoff, which is denoted by
$v_j$, is selected. If the payoff of $v_j$ is larger than that of $v_i$, $v_i$ will copy the
strategy of $v_j$. If more than one neighbor has the same largest payoff, we select one of them
randomly with equal probability. After tentatively determining the $N_u$ copying events, we
replace the strategies of the selected nodes simultaneously. We do not assume mutation. This
marks the end of one round.

One run lasts until a quasistationary state is attained or the unanimity of one strategy is
almost achieved. Specifically, we set the number of rounds to 20000 in the case of the scale-free
network, the regular random graph, and the square lattice. In the case of the extended cycle
and the cycle, the number of rounds is equal to 140000.
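
The deterministic update step can be sketched as follows. The function name and the three-node example are hypothetical, and the sketch draws the updated players without replacement via `random.sample`, which is one possible reading of the selection procedure described above.

```python
import random


def deterministic_update(adj, strategy, payoff, n_update, rng):
    """Synchronously update `n_update` players; returns a new strategy list."""
    updated = list(strategy)
    for i in rng.sample(range(len(adj)), n_update):
        best = max(payoff[j] for j in adj[i])
        if best > payoff[i]:
            # Break ties among the best-performing neighbors at random.
            ties = [j for j in adj[i] if payoff[j] == best]
            # Copy from the old strategy list so that all copying events
            # of the round take effect simultaneously.
            updated[i] = strategy[rng.choice(ties)]
    return updated


# Hypothetical toy setting: a path of three players, all selected for update.
adj = [[1], [0, 2], [1]]
strategy = ["CD", "GC", "CD"]
payoff = [0.0, 2.0, 1.0]
new_strategy = deterministic_update(adj, strategy, payoff, 3, random.Random(0))
```

In this example the middle player earns the largest payoff, so both end players copy it, while the middle player keeps its strategy because no neighbor earns more.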

3 Results 

3.1 GC versus CD 

When a player passes on the received help to a neighbor, a neighbor is randomly selected as
recipient with equal probability. A chain of helping behavior is equivalent to a simple random
walk with random termination. If $p_i = 1$ ($1 \le i \le N$), the random walk may continue
forever. In this hypothetical situation, the payoff that player $v_i$ receives is proportional to
the stationary density of the random walk. In any undirected network, the stationary density of
the simple random walk is proportional to the degree (e.g., Noh and Rieger, 2004). This relation
roughly holds true for uncorrelated networks even in the presence of some absorbing nodes at
which the random walk terminates (Noh and Rieger, 2004). Therefore, we expect that the number of
times that the chain of helping behavior reaches a given node is roughly proportional to the
degree. Because a single passage of a chain contributes $b - c > 0$ to the payoff, the payoff per
round for each player is roughly proportional to the degree.
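
The proportionality between the stationary density and the degree can be checked numerically on a small example. The sketch below uses a hypothetical four-node undirected network and verifies that the distribution $\pi_i = k_i / \sum_j k_j$ is left unchanged by one step of the non-terminating simple random walk.

```python
def random_walk_step(adj, dist):
    """Push a probability distribution one step along a simple random walk."""
    new = [0.0] * len(adj)
    for j, nbrs in enumerate(adj):
        for i in nbrs:
            new[i] += dist[j] / len(nbrs)   # j moves to each neighbor equally
    return new


# Hypothetical network with degrees 3, 2, 2, and 1.
adj = [[1, 2, 3], [0, 2], [0, 1], [0]]
total_degree = sum(len(nbrs) for nbrs in adj)
pi = [len(nbrs) / total_degree for nbrs in adj]   # proportional to degree
after = random_walk_step(adj, pi)
```

Node $i$ receives $\pi_j / k_j = 1 / \sum_l k_l$ from each of its $k_i$ neighbors, which sums to $\pi_i$ again.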

To verify this prediction, we carry out Monte Carlo simulations of the game of upstream
reciprocity on the scale-free network with a random mixture of GCs and CDs. We set $b = 1.5$.
The probability that each player is initially GC or CD is 0.5. Figure 2A shows the dependence
of the payoff per round on the degree of the player, just before the first update (i.e.,
$t = 0$). Each data point corresponds to the payoff per round averaged over all players having
the same degree and same strategy. For each strategy, the payoff per round is roughly
proportional to the degree. CDs generally gain larger payoffs than GCs, because CDs exploit GCs
in the neighborhood.

However, from Fig. 2A, it cannot be concluded that CD takes over GC in the evolutionary
time course. The same statistics are plotted at $t = 200$ in Fig. 2B. As in the case of Fig. 2A,
CD gains more than GC at the same degree. At this stage, however, most hubs are occupied
by GCs for the following reason. There are usually some GCs in the neighborhood of a GC
hub, which is also the case under random initial condition. Then, the GC hub tends to gain a
large payoff because GC neighbors help the GC hub. As a result of evolution, GC will spread
from the hub to the neighbors, which further increases the payoff of the GC hub. Suppose
a situation where CDs invade neighbors of the GC hub and exploit it. Because the degrees
of these CDs are generally not large, the CDs cannot be helped by many players even if the
neighborhood is occupied by GCs. Therefore, the CDs would not gain a payoff per round as
large as that of the GC hub. Accordingly, GC tends to be stabilized at the hub. In contrast,
if CD spreads from the hub to the neighbors, the CD hub will obtain a small payoff. Then, a
GC in the neighborhood of the CD hub may take over the hub; CDs occupying hubs are not
stabilized. GCs gradually spread from hubs to players having small degrees (Fig. 2C), and the
entire network is eventually occupied by GCs after sufficient rounds (Fig. 2D).

The time courses of the mean degree of GCs and that of CDs corresponding to the run
shown in Fig. 2A-D are plotted in Fig. 2E. First, the mean degree of GCs grows until most
hubs are occupied by the GCs. It then relaxes to $\langle k \rangle = 8$. The mean degree of the
CDs is considerably smaller than $\langle k \rangle = 8$ throughout the run.

The time courses of the average payoff per round of GCs and that of CDs, corresponding
to the same run as above, are shown in Fig. 2F. Initially, the two average payoffs decrease
because CDs replace GCs. Then, GCs are stabilized at hubs, and the GCs begin to disseminate
to increase the average payoff of both GCs and CDs. At any $t$, CDs earn more than GCs on
average. However, this does not imply that CD invades GC macroscopically. As shown
in Fig. 2B-C, the players with the largest payoffs are GC hubs rather than CDs. A player is
chosen as a potential parent to be mimicked by other players with the probability proportional
to its degree (Newman, 2003; Noh and Rieger, 2004). In the scale-free network, a neighbor
of an arbitrary player tends to be a hub, and then GC hubs are imitated by relatively many
players. Therefore, while the average payoff of CDs is maintained at a larger value than that
of GCs, the fraction of CDs gradually decreases until the CD becomes extinct. The relative
strength of a strategy in reproduction is determined not by the average payoff of the players
using that strategy but by the degree-weighted average payoff of these players.
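
The difference between the two averages can be illustrated with made-up numbers. The sketch below assumes that the degree-weighted average payoff of a strategy is $\sum_i k_i \pi_i / \sum_i k_i$ over the players $i$ using that strategy, where $\pi_i$ is the payoff of player $i$; this definition, the function names, and the toy values are our own assumptions and are not simulation results.

```python
def mean_payoff(players):
    """Plain average payoff over (degree, payoff) pairs."""
    return sum(pay for _, pay in players) / len(players)


def degree_weighted_payoff(players):
    """Average payoff with each player weighted by its degree."""
    return sum(k * pay for k, pay in players) / sum(k for k, _ in players)


# Hypothetical (degree, payoff) pairs: one GC hub with two exploited GC
# leaves, and two CDs of moderate degree.
gc = [(10, 6.0), (1, -1.0), (1, -1.0)]
cd = [(2, 2.0), (2, 2.0)]

print(mean_payoff(gc), mean_payoff(cd))
print(degree_weighted_payoff(gc), degree_weighted_payoff(cd))
```

In this toy example the CDs have the larger plain average, whereas the GCs have the larger degree-weighted average because the hub dominates the weighted sum.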

The scenario of evolution of helping behavior described above requires heterogeneous degree
distribution. To compare different networks, at a given value of $b$, we generate five
realizations of the network and carry out 10 runs on each network for the scale-free network and
the regular random graph, which are generated from stochastic algorithms. For the other three
deterministic networks, we carry out 50 runs on the network. The average of the final fraction
of GC, obtained from the 50 runs, is plotted against $b$ in Fig. 3. In all the networks, except
the regular random graph, the fraction of GC increases with $b$. In fact, the fraction jumps
from unanimity of CD to that of GC at a threshold value of $b$. The threshold value of $b$ above
which GCs survive is considerably smaller in the scale-free network than in the other networks.
Heterogeneous networks promote the evolution of helping behavior. Among the other networks,
the threshold value of $b$ is the smallest in the cycle. The next smallest value is that of the
extended cycle, followed by the square lattice. The threshold value of $b$ in the random graph is
greater than the upper limit shown in Fig. 3 (i.e., $b = 10$). Unlike the Barabasi-Albert
scale-free network and the regular random graph, the other three networks, i.e., the cycle, the
extended cycle, and the square lattice, are capable of spatial reciprocity. This fact explains
why these three networks accommodate more GCs as compared to the regular random graph. However,
the effect of spatial reciprocity is smaller than the effect of the scale-free networks, at least
under the present parameter regime.

We confirm that the results are qualitatively the same for some variations of the model.
First, we change the mean degree to $\langle k \rangle = 6$ (Fig. 4A) and
$\langle k \rangle = 14$ (Fig. 4B). The results are qualitatively the same as those for
$\langle k \rangle = 8$. Quantitatively, GC survives more easily for a smaller
$\langle k \rangle$, which coincides with the results for the Prisoner's Dilemma on the regular
random graph (Ohtsuki et al., 2006). Second, we change $p_i$ for GC and PO to 0.7 (Fig. 5A) and
0.9 (Fig. 5B). The results are qualitatively the same as those for $p_i = 0.8$. Quantitatively, a
larger value of $p_i$ yields a larger fraction of GC. Third, we change the number of players
updated in one round to $N_u = 20$ (Fig. 6A) and $N_u = 2000$ (Fig. 6B). The results are
qualitatively the same as those for $N_u = 200$. Fourth, we show the effect of different
stochastic update rules. In the imitation rule (Ohtsuki et al., 2006), at the end of each round,
a potentially updated player $v_i$ selects a potential parent out of the $k_i + 1$ players, i.e.,
$v_i$ and the $k_i$ neighbors of $v_i$. The probability that a node is selected as the parent is
proportional to the payoff. When the payoff is negative, we set this probability to zero. In the
Fermi rule (e.g., Szabo and Toke, 1998; Traulsen et al., 2006), $v_i$ selects a potential parent
$v_j$ out of the $k_i$ neighbors with equal probability and copies the strategy of $v_j$ with
probability $\{1 + \exp[\beta((\text{payoff of player } v_i) - (\text{payoff of player }
v_j))]\}^{-1}$. Otherwise, $v_j$ copies the strategy of $v_i$. The results for the imitation rule
and those for the Fermi rule with $\beta = 0.2$ are shown in Figs. 7A and 7B, respectively. The
results resemble those for the deterministic update rule. Although the one-dimensional chain
allows for GC at small values of $b$, comparable to or even smaller than the values for the
scale-free network, our main result that heterogeneous networks enhance generous cooperators as
compared to homogeneous networks is not violated.
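
The selection probabilities used by the two stochastic rules can be written compactly. In the sketch below, the function names are hypothetical, and the fallback used when no candidate has a positive payoff (the focal player keeps its own strategy) is our own assumption, because the text does not specify this case.

```python
import math


def fermi_copy_probability(payoff_i, payoff_j, beta=0.2):
    """Probability that player i copies the strategy of neighbor j."""
    return 1.0 / (1.0 + math.exp(beta * (payoff_i - payoff_j)))


def imitation_probabilities(payoffs):
    """Parent-selection probabilities for the imitation rule.

    `payoffs[0]` belongs to the focal player and the remaining entries to
    its neighbors. Negative payoffs receive zero weight.
    """
    weights = [max(x, 0.0) for x in payoffs]
    total = sum(weights)
    if total == 0.0:
        # Assumption: with no positive payoff, the focal player keeps
        # its own strategy.
        return [1.0] + [0.0] * (len(payoffs) - 1)
    return [w / total for w in weights]


print(fermi_copy_probability(1.0, 1.0))
print(imitation_probabilities([2.0, -1.0, 6.0]))
```

With equal payoffs the Fermi rule copies with probability 0.5, and the probability approaches 1 as the payoff advantage of the neighbor grows.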

In the case of the cycle, the threshold value of $b$ above which the GC survives the
invasion by CD has been obtained for a different update rule in the limit of weak selection
(Nowak and Roch, 2007). The survival of the GC is possible when $b/c > f(p)$, where
$f(p) = \left(8 + 2p + 8\sqrt{1 - p^2}\right) / \left(3 + 4p + \sqrt{1 - p^2}\right)$. Because we
set $c = 1$ and $p = 0.8$, the theoretical threshold in this case is equal to $f(0.8) = 2.12$.
Figure 3 indicates that the GC survives when $b$ is larger than approximately 2.6 in the
numerical simulations; this value is not too far from the theoretical value. The discrepancy
between the theoretical and numerical results is probably attributed to the use of different
update rules (stochastic versus deterministic), the difference in selection pressure (weak
selection versus strong selection), and/or the difference in the boundary condition of the
network (open end versus periodic boundary condition).
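
The quoted threshold can be reproduced numerically. The sketch below assumes that the threshold function has the form $f(p) = (8 + 2p + 8\sqrt{1 - p^2}) / (3 + 4p + \sqrt{1 - p^2})$, which is consistent with the value 2.12 stated for $p = 0.8$; the function name is hypothetical.

```python
import math


def threshold_gc_vs_cd(p):
    """Assumed form of f(p): minimum b/c for GC to resist CD on the cycle."""
    root = math.sqrt(1.0 - p * p)
    return (8.0 + 2.0 * p + 8.0 * root) / (3.0 + 4.0 * p + root)


print(round(threshold_gc_vs_cd(0.8), 2))
```

For $p = 0.8$, the square root equals 0.6, so the ratio is $14.4 / 6.8 \approx 2.12$.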

3.2 GC versus CC

Next, we examine the case in which GCs and CCs are initially present. Although GC and CC
are both cooperative in a classical sense, the GC is more cooperative than the CC in a game
of upstream reciprocity. Similar to the case considered in Sec. 3.1, we start each Monte Carlo
simulation using an equal fraction of GCs and CCs. In contrast to a population composed of
GCs and CDs, in this case, the unanimity of GC or that of CC, instead of a mixture of GC and
CC, is reached very often in the final round of runs in the scale-free network and the square
lattice. This unanimity is attained even if the number of rounds is set to a small value. If all
runs end up at unanimity, the fraction of GC is equal to the fraction of runs in which unanimity
of the GC is reached. This quantity is discretized by the number of runs. Therefore, we carry
out 100 runs in the scale-free network and the square lattice to overcome the discretization
effect. In the other networks, we carry out 50 runs as in the previous case.

The final fraction of the GC in different networks is shown in Fig. 8. The scale-free network
enhances the evolution of the GC to a greater extent than the other networks, except at large
values of $b$. This result and the ordering of the five networks according to the threshold value
of $b$ above which the GC evolves are consistent with those obtained in the case of the
population of GCs and CDs (Sec. 3.1). The threshold value of $b$ in the random graph is greater
than the upper limit shown in Fig. 8 (i.e., $b = 10$).

In the case of the cycle, it has been theoretically shown for the original model that the GC
survives the invasion by CC when $b/c > f(0.8) = 2.12$ (Nowak and Roch, 2007). In Fig. 8,
the GC survives in the cycle when $b/c > 3.0$, which is of the same order as the theoretically
predicted value for the original model.

3.3 GC versus PO 

In this section, we investigate the population composed of GCs and POs. Recall that, even
though PO is cooperative in that it passes on helping behavior to a neighbor, the GC is more
cooperative in comparison because it initiates a chain of helping behavior and PO does not.

The final fractions of the GC obtained from 50 runs in different networks are compared in
Fig. 9. Similar to the results reported in Secs. 3.1 and 3.2, the scale-free network yields the
largest fraction of the GC. The ordering of the five networks according to the threshold value of
$b$ is also consistent with those obtained in the population of GCs and CDs (Sec. 3.1) and that
of GCs and CCs (Sec. 3.2).

Theoretically, GC survives for the original model in the cycle when $b/c > g(p)$, where
$g(p) = p\left(3 + 3p + \sqrt{1 - p^2}\right) / \left[(1 + 2p)\left(1 + p - \sqrt{1 - p^2}\right)\right]$
(Nowak and Roch, 2007). For the value $p = 0.8$ used in our simulations, the theoretical
threshold is $g(0.8) = 1.54$. Figure 9 suggests that the threshold is about 1.5, which is close
to the theoretical value for the original model.
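
As a numerical check of the quoted value, the sketch below evaluates the threshold for GC against PO. It assumes that the threshold function has the form $g(p) = p(3 + 3p + \sqrt{1 - p^2}) / [(1 + 2p)(1 + p - \sqrt{1 - p^2})]$, which is consistent with the value 1.54 stated for $p = 0.8$; the function name is hypothetical.

```python
import math


def threshold_gc_vs_po(p):
    """Assumed form of g(p): minimum b/c for GC to resist PO on the cycle."""
    root = math.sqrt(1.0 - p * p)
    numerator = p * (3.0 + 3.0 * p + root)
    denominator = (1.0 + 2.0 * p) * (1.0 + p - root)
    return numerator / denominator


print(round(threshold_gc_vs_po(0.8), 2))
```

For $p = 0.8$, the numerator equals $0.8 \times 6 = 4.8$ and the denominator equals $2.6 \times 1.2 = 3.12$, which gives approximately 1.54.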

3.4 Populations comprising four strategies 

We examine the dynamics of a population in which all four strategies are initially present.
Each player is assumed to adopt each of the four strategies independently with probability 1/4.
Similar to the case of the population of GCs and CCs, most runs end up at unanimity of one
strategy in the scale-free network and the extended cycle. Therefore, we carry out 100 runs for
these two networks to enhance the precision in the computed fraction of different strategies. For
the other networks, we carry out 50 runs.

The final fraction of each strategy in the five networks is shown in Fig. 10. In the scale-free
network (Fig. 10A), CD and PO do not survive for any value of $b$. The fraction of GC increases
with the value of $b$. In the regular random graph, the GC does not survive, and the network
is almost entirely inhabited by the least cooperative players, i.e., CDs (Fig. 10B). For GC to
survive, a value of $b$ larger than 10, which is the upper limit of $b$ examined in Fig. 10B,
is required. In the square lattice (Fig. 10C), the extended cycle (Fig. 10D), and the cycle
(Fig. 10E), GC takes over CD at a sufficiently large value of $b$. The lowest to highest
threshold value of $b$ above which the GC survives follows the order of the scale-free network,
the cycle, the extended cycle, the square lattice, and the regular random graph.

The results are robust against various changes of the model, such as the value of
$\langle k \rangle$ (Figs. 4C and D), the value of $p_i$ (Figs. 5C and D), the value of $N_u$
(Figs. 6C and D), and the update rule (Figs. 7C and D). The results in this section including the
robustness results are consistent with those obtained for the populations that comprise two
strategies (Secs. 3.1, 3.2, and 3.3).

4 Discussion 

We have shown that heterogeneous networks enhance cooperative behavior in a game of
upstream reciprocity. Based on the property of the simple random walk on networks, chains of
helping behavior traverse hub players more often than players having small degrees. Then,
hubs tend to gain a larger payoff. The most cooperative strategy (i.e., GC) is stable once it
inhabits hubs, from where it spreads to the entire network. From a quantitative point of view,
the impact of heterogeneous networks on enhancing altruism can be much more than that of
spatial reciprocity in most cases. Our results are robust against variation in some parameters
of the model ($\langle k \rangle$, $p_i$, and $N_u$) and variation in update rules.

The route to altruism in the game of upstream reciprocity proposed in this study is similar
to that in the Prisoner's Dilemma on heterogeneous networks (Santos and Pacheco, 2005;
Duran and Mulet, 2005; Santos et al., 2006; Santos and Pacheco, 2006). In this framework,
each player is assumed to either cooperate with or defect against all neighbors in a round.
Once a cooperator occupies a hub and some surrounding nodes, the hub gains a large payoff
and is likely to disseminate its offspring (i.e., cooperators) to the neighbors. This event
further increases the payoff of the hub, and the cooperation on the hub is stabilized. In
contrast, defection on a hub is not stable because the hub does not gain a large payoff if the
defector hub disseminates its offspring to the neighbors. Cooperators are propagated from hubs to
the entire network. In the game of upstream reciprocity in networks, suppose that a GC hub
disseminates its offspring to the neighbors. This hub will gain a larger payoff in the subsequent
rounds because the neighbors will tend to pass on the chains of helping behavior. Then, the GC
hub will receive helping behavior more often than typical players such that its payoff increases,
and the GC is stabilized on the hub. This positive feedback is weaker in the case of the PO and
absent in the case of the CD and CC.

When player X with a small degree copies the strategy of a successful hub neighbor Y,
X may not gain a large payoff because X is not a hub. In the Prisoner's Dilemma on
networks, many previous studies assumed that selection is based on the summed payoff; in
this scheme, each player sums the payoffs obtained by playing against all neighbors to determine
the payoff per round (Santos and Pacheco, 2005; Duran and Mulet, 2005; Santos et al., 2006;
Santos and Pacheco, 2006). However, it may be advantageous for X not to copy the strategy
of Y, because X is not as well connected as Y. It may be more profitable for X to copy the
strategy of a neighbor that earns a larger payoff per edge. This update rule corresponds to
selection based on the average payoff, i.e., the summed payoff divided by the degree. The average
payoff scheme does not enhance cooperation in the Prisoner's Dilemma on heterogeneous
networks (Santos and Pacheco, 2006; Tomassini et al., 2007). This argument is also applicable to
the game of upstream reciprocity in scale-free networks. The evolution of helping behavior is
likely to be hampered if the selection is based on the average payoff. This is a major limitation of
the present study. The update rule that we have adopted, as well as the rule based on the summed
payoff used in the Prisoner's Dilemma, may represent a situation in which players are unaware
of the degree of their neighbors.
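
The difference between the two selection schemes can be made concrete with a small sketch. The payoff numbers, the node labels, and the deterministic "copy the fittest neighbor" rule are hypothetical choices made for illustration; they are not the update rule or the data of this study.

```python
def fitness(summed_payoff, degree, rule):
    """Fitness of a neighbor under the summed or the average payoff scheme."""
    if rule == "summed":
        return summed_payoff
    if rule == "average":
        return summed_payoff / degree  # payoff per edge
    raise ValueError(f"unknown rule: {rule}")


def role_model(neighbors, rule):
    """Neighbor with the largest fitness (an assumed imitate-the-best rule)."""
    return max(neighbors, key=lambda name: fitness(*neighbors[name], rule))


# Hypothetical neighbors of a low-degree player X: (summed payoff, degree).
neighbors = {
    "Y_hub": (10.0, 20),  # large summed payoff, but only 0.5 per edge
    "Z_leaf": (1.6, 2),   # small summed payoff, but 0.8 per edge
}

print(role_model(neighbors, "summed"))
print(role_model(neighbors, "average"))
```

With these assumed numbers, X imitates the hub Y under the summed payoff and the low-degree neighbor Z under the average payoff, so the advantage that hubs owe to their degree disappears once payoffs are normalized.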

In the game of upstream reciprocity, hubs gain relatively large payoffs because a simple
random walker visits hubs relatively often. This is true for an eternally lasting random walk
on arbitrary undirected networks (Noh and Rieger, 2004). However, in our model, the random
walk terminates in finite time. Then, the random walker may visit specific non-hub
nodes more frequently than it visits hubs, as in the case of the random walk in networks with
an absorbing boundary (Noh and Rieger, 2004; Newman, 2005). For heterogeneous networks
in which populations are not well mixed, perhaps with degree correlation between adjacent
nodes or global structure of networks, our results may be modified. The GC may spread from
specific non-hub players. In directed networks, the frequency with which the random walker
visits nodes can also deviate from the value predicted from the degree (Donato et al., 2004;
Masuda and Ohtsuki, 2009). Roughly speaking, however, the random walk tends to visit more
connected players in all the cases discussed. Therefore, we expect that our results hold
qualitatively for general heterogeneous networks.
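
The statement about visit frequencies can be checked on a toy example. For a simple random walk on a connected undirected network, the distribution proportional to degree is stationary; the sketch below verifies this numerically and then accumulates the expected number of arrivals at each node for a walk that stops after each hop with a fixed probability. The five-node network, the uniform starting node, and the continuation probability of 0.8 are assumptions made for illustration and do not reproduce the model of this study.

```python
def transition_rows(adj):
    """Hop probabilities of a simple random walk: 1/k_i to each neighbor."""
    return {i: {j: 1.0 / len(nbrs) for j in nbrs} for i, nbrs in adj.items()}


def hop(dist, rows, keep=1.0):
    """Advance the walker's distribution one hop; it survives with prob keep."""
    new = {i: 0.0 for i in rows}
    for i, mass in dist.items():
        for j, p in rows[i].items():
            new[j] += keep * mass * p
    return new


def expected_arrivals(adj, keep, n_steps):
    """Expected arrivals per node for a terminating walk, uniform start."""
    rows = transition_rows(adj)
    dist = {i: 1.0 / len(adj) for i in adj}
    arrivals = {i: 0.0 for i in adj}
    for _ in range(n_steps):
        dist = hop(dist, rows, keep)
        for i in adj:
            arrivals[i] += dist[i]
    return arrivals


# Hypothetical undirected network: node 0 is a hub with degree 4.
adj = {0: [1, 2, 3, 4], 1: [0, 2], 2: [0, 1], 3: [0], 4: [0]}

# Degree-proportional distribution, k_i / sum_j k_j.
total_degree = sum(len(nbrs) for nbrs in adj.values())
stationary = {i: len(nbrs) / total_degree for i, nbrs in adj.items()}
after_one_hop = hop(stationary, transition_rows(adj))  # should be unchanged

# Terminating walk: continuation probability 0.8 is an assumed value.
arrivals = expected_arrivals(adj, keep=0.8, n_steps=50)
print(stationary)
print(arrivals)
```

In this toy case the hub still receives the most arrivals, but the arrival counts of the terminating walk are not required to be proportional to the degree, in line with the caveat above.
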
Figure 1: Architecture of networks. (A) Scale-free network, (B) regular random graph, (C) 
square lattice, (D) extended cycle, and (E) cycle. 
Figure 2: Payoff per round for each strategy as a function of degree at (A) t = 0, (B) t = 200, 
(C) t = 800, and (D) t = 2400. (E) Time course of mean degree for GC and CD. (F) Time 
course of average payoff for GC and CD. We set ⟨k⟩ = 8, Pi = 0.8, and N_u = 200. 
Figure 3: Final fractions of GC in various networks when players initially adopt either GC or 
CD. We set ⟨k⟩ = 8, Pi = 0.8, and N_u = 200. 
Figure 4: Results for different values of ⟨k⟩. (A, B) Final fractions of GC in various networks 
when players initially adopt either GC or CD. We set (A) ⟨k⟩ = 6 and (B) ⟨k⟩ = 14. (C, D) 
Final fractions of four strategies when players initially adopt either GC, CD, CC, or PO in the 
scale-free network. We set (C) ⟨k⟩ = 6 and (D) ⟨k⟩ = 14. 
Figure 5: Results for different values of Pi. (A, B) Final fractions of GC in various networks 
when players initially adopt either GC or CD. We set Pi for GC to (A) 0.7 and (B) 0.9. (C, D) 
Final fractions of four strategies when players initially adopt either GC, CD, CC, or PO in the 
scale-free network. We set Pi for GC and PO to (C) 0.7 and (D) 0.9. 
Figure 6: Results for different numbers of players updated per round. (A, B) Final fractions 
of GC in various networks when players initially adopt either GC or CD. We set (A) N_u = 20 
and (B) N_u = 2000. (C, D) Final fractions of four strategies when players initially adopt either 
GC, CD, CC, or PO in the scale-free network. We set (C) N_u = 20 and (D) N_u = 2000. 
Figure 7: Results for different update rules. (A, B) Final fractions of GC in various networks 
when players initially adopt either GC or CD. We use (A) imitation update rule and (B) Fermi 
update rule. (C, D) Final fractions of four strategies when players initially adopt either GC, 
CD, CC, or PO in the scale-free network. We use (C) imitation update rule and (D) Fermi 
update rule. 
Figure 9: Final fractions of GC in various networks when players initially adopt either GC or 
PO. 
Figure 10: Final fractions of four strategies when players initially adopt either GC, CD, CC, 
or PO in (A) scale-free network, (B) regular random graph, (C) square lattice, (D) extended 
cycle, and (E) cycle. We set ⟨k⟩ = 8, Pi = 0.8, and N_u = 200. 